2 research outputs found

    Parallel computation of the hyperbolic QR factorization

    In this thesis we presented how to compute the hyperbolic ($J$-unitary) QR factorization. First, the theory was laid out, giving two ways of reducing a matrix $G \in \mathbb{C}^{m \times n}$, $m \geq n$, to block upper triangular form. One way reduces a single column with a $J$-unitary Householder-like reflector; the necessary and sufficient conditions for the existence of such reflectors were established. The other way reduces two columns at a time using Givens rotations. In that chapter the notion of a proper form was defined, together with how matrices are brought to proper form and how proper forms are then fully reduced by $J$-unitary matrices of smaller dimensions. Furthermore, we related the indefinite QR factorization to the Hermitian indefinite factorization and presented the optimal pivoting strategy for the latter: the one with the smallest pivot growth in every case, regardless of whether one or two columns are chosen as pivotal. The same pivoting strategy was then applied to the QR factorization. Finally, a sequential algorithm for reducing the matrix $G$ to block upper triangular form was presented, together with the parts of it that were parallelized. While optimizing the code, the memory architecture of the machine was taken into account, as was the way the OpenMP and MKL libraries used for parallelization operate. Tests on randomly generated matrices were performed on Intel's Xeon Phi 7210, whose special memory architecture was also taken into account.
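    The single-column reduction described above can be illustrated with a minimal NumPy sketch of a $J$-unitary Householder-like reflector $H = I - \frac{2}{v^{\ast} J v}\, v v^{\ast} J$, which satisfies $H^{\ast} J H = J$ and maps a column $x$ to a multiple of $e_1$ whenever the existence condition $\operatorname{sign}(x^{\ast} J x) = j_1$ holds. This is only a sketch under those assumptions; the function name and the unblocked, dense construction are illustrative and are not the thesis's blocked, pivoted OpenMP/MKL implementation.

        import numpy as np

        def j_householder(x, j):
            """Build a J-unitary Householder-like reflector H (H^* J H = J, J = diag(j),
            j_i = +-1) such that H @ x is a multiple of the first unit vector e_1.
            Illustrative, unblocked sketch only."""
            x = np.asarray(x, dtype=complex)
            j = np.asarray(j, dtype=float)
            xJx = np.real(x.conj() @ (j * x))        # x^* J x, always real
            if xJx * j[0] <= 0.0:
                # existence condition: sign(x^* J x) must equal the first sign in J
                raise ValueError("no J-Householder reflector reduces this column")
            phase = x[0] / abs(x[0]) if x[0] != 0 else 1.0
            beta = -phase * np.sqrt(xJx / j[0])      # sign chosen to avoid cancellation
            v = x.copy()
            v[0] -= beta
            alpha = np.real(v.conj() @ (j * v))      # v^* J v
            H = np.eye(x.size, dtype=complex) - (2.0 / alpha) * np.outer(v, v.conj() * j)
            return H, beta

        # Small check: H is J-unitary and annihilates x below its first entry.
        j = np.array([1.0, -1.0, 1.0])
        x = np.array([3.0, 1.0, 1.0])
        H, beta = j_householder(x, j)
        assert np.allclose(H.conj().T @ np.diag(j) @ H, np.diag(j))
        assert np.allclose(H @ x, beta * np.array([1.0, 0.0, 0.0]))

    In the thesis this building block is combined with the two-column Givens reduction, the pivoting strategy, and blocking; the sketch shows only the single-column step.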

    The LAPW method with eigendecomposition based on the Hari--Zimmermann generalized hyperbolic SVD

    In this paper we propose an accurate, highly parallel algorithm for the generalized eigendecomposition of a matrix pair $(H, S)$, given in a factored form $(F^{\ast} J F, G^{\ast} G)$. Matrices $H$ and $S$ are in general complex and Hermitian, and $S$ is positive definite. Matrices of this type arise when the Hamiltonian of a quantum mechanical system is represented in terms of an overcomplete set of basis functions. This expansion is part of a class of models within the broad field of Density Functional Theory, which is considered the gold standard in condensed matter physics. The overall algorithm consists of four phases, the second and the fourth being optional, where the last two phases are the computation of the generalized hyperbolic SVD of a complex matrix pair $(F, G)$ with respect to a given matrix $J$ defining the hyperbolic scalar product. If $J = I$, then these two phases compute the GSVD in parallel, very accurately and efficiently.

    Comment: The supplementary material is available at https://web.math.pmf.unizg.hr/mfbda/papers/sm-SISC.pdf due to its size. This revised manuscript is currently being considered for publication.
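    The connection between the last two phases and the generalized eigendecomposition can be made explicit with a short derivation. It assumes the usual form of the generalized hyperbolic SVD, $F = U \Sigma_F Z$ and $G = V \Sigma_G Z$ with $U^{\ast} J U = J$, $V^{\ast} V = I$, $Z$ nonsingular, and $\Sigma_F$, $\Sigma_G$ diagonal with nonnegative entries; the exact normalization used by the Hari--Zimmermann algorithm may differ, so this is a sketch rather than a statement of the paper's method:

        H = F^{\ast} J F = Z^{\ast} \Sigma_F^{\ast} (U^{\ast} J U) \Sigma_F Z
                         = Z^{\ast} (\Sigma_F^{\ast} J \Sigma_F) Z,
        S = G^{\ast} G   = Z^{\ast} \Sigma_G^{\ast} \Sigma_G Z.

    Hence the $i$-th column $z_i$ of $Z^{-1}$ satisfies $H z_i = \lambda_i S z_i$ with $\lambda_i = j_i\, \sigma_{F,i}^2 / \sigma_{G,i}^2$, where $j_i$ is the $i$-th diagonal entry of $J$ and $\sigma_{F,i}$, $\sigma_{G,i}$ are the diagonal entries of $\Sigma_F$ and $\Sigma_G$. For $J = I$ this reduces to the familiar GSVD relation for the definite pencil $(F^{\ast} F, G^{\ast} G)$.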